Summary Report of My Scientific Activities during ERCIM Postdoc fellowship at NTNU
Abstract
Motif discovery is a crucial part of regulatory network identification and is therefore widely studied in the literature. Motif discovery programs search for statistically significant, well-conserved and over-represented patterns in given promoter sequences. When gene expression data is available, there are three main paradigms for motif discovery: cluster-first, regression, and joint probabilistic. The success of motif discovery depends highly on the homogeneity of the input sequences, regardless of the paradigm employed. In this work, we propose a methodology for obtaining homogeneous subsets of the input sequences to increase motif discovery performance. It unifies the cluster-first and regression paradigms through iterative cluster re-assignment. The experimental results show the effectiveness of the methodology.

Status: Accepted for publication in the Proceedings of the Computational Systems Bioinformatics (CSB 2006) conference, August 14-18, 2006, Stanford University, California.

4.2 Paper 2

Title: TScan: A Two-step De novo Motif Discovery Method.

Authors: Osman Abul, Geir Kjetil Sandve, and Finn Drabløs

Abstract: Computational discovery of novel motifs in biological sequences is an important and well-studied problem. The key to motif discovery methods, either de novo or library based, is having well-defined scoring functions. Several different scalar-valued scoring functions have been proposed, each measuring some notion of biological motifs; that is, we lack a perfect one capable of measuring all notions together. In this work, we propose a two-step de novo motif discovery paradigm employing two scoring functions that measure different notions of biological relevance. We define a word-counting based method, called TScan, that follows this paradigm. It is mainly inspired by MDScan, but does not require supplementary ChIP-chip data. Our results on seven data sets from a recent study are promising, with discovered motifs agreeing well with the consensus motifs defined for the data sets.

Status: Submitted for publication to the Third Annual RECOMB Satellite Workshop on Regulatory Genomics, July 17-18, 2006, National University of Singapore, Singapore.

4.3 Paper 3

Title: Bias Analysis of Motif Models for Biosequences.

Authors: Geir Kjetil Sandve, Osman Abul, Vegard Walseng, and Finn Drabløs

Abstract: The discovery of motifs in biosequences is an important problem and has in recent years attracted much research interest, resulting in more than a hundred tools. The...
Similar resources
ERCIM "Alain Bensoussan" Fellowship Programme Scientific Report
During my fellowship, I collaborated with Prof. Ole Morten Aamo and Ass. Prof. Øyvind Stavdahl. One of the objectives of the project was the development and investigation of methods for computer-based video sequence analysis, in order to detect a certain kind of General Movements (GM) observed in the motion of infants, with the aim of determining the risk of Cerebral Palsy. Cerebral...
Final Report – 2015-2016 Medical Informatics Postdoctoral Research
This report provides a summary of the various projects I undertook during my one-year fellowship. My primary research activity was with the Medical Ontology Group (MOR) under the guidance of Dr. Bodenreider. The research with MOR focused on the interoperability of biomedical terminologies and ontologies, evaluating semantic aspects such as coverage, completeness and alignment. Specifically, I...
Annual Report of Antoine LEMENANT
During the year 2009/2010 I was principally working on 3 different problems in collaboration with 4 different people, 3 of them working in Pisa and one in Parma. Those works led to 3 scientific papers and 1 review paper submitted during the year 2009/2010. The first work is a collaboration with Pablo Àlvarez-Caudevilla, postdoc at the center De Giorgi. We studied a problem in PDEs, about the as...
Serious work on playful experiences: a preliminary set of challenges
This work was carried out during the tenure of an ERCIM “Alain Bensoussan” Fellowship Programme. Abstract: As work, leisure and social activities blend together, and amateur and professional practices become harder to distinguish, we need to explore the role of technology that works to support people in this rich range of everyday experiences. Incorporating fun, playful elements in the workpla...
Decoding Choice Encodings
We study two encodings of the asynchronous π-calculus with input-guarded choice into its choice-free fragment. One encoding is divergence-free, but refines the atomic commitment of choice into gradual commitment. The other preserves atomicity, but introduces divergence. The divergent encoding is fully abstract with respect to weak bisimulation, but the more natural divergence-free encoding is n...